perm filename CDOC[CMP,LSP] blob
sn#208723 filedate 1976-03-31 generic text, type C, neo UTF8
COMMENT ā VALID 00006 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 A Brief Description of the Stanford Lisp Compiler
C00005 00003 Fundamental Table Dispatch Mechanism
C00009 00004 Top Level
C00013 00005 Pass One
C00017 00006 Pass Two
C00019 ENDMK
Cā;
A Brief Description of the Stanford Lisp Compiler
This document will attempt to give a brief explanation of the
fundamental structure and behavior of the Stanford Lisp compiler.
Throughout the operation of the Lisp compiler, as described in the
Lisp 1.6 manual will be assumed known. In all explicit descriptions
below, the compiler will be taken to be operation on a disk file to
produce another disk file, as this is the fundamental mode of
operation.
Organisation
The compiler is divided into three major parts. Toplevel,
the first of these is concerned with processing user commands,
handling files and in general providing the framework within which
actual compiling takes place. The remaining two parts are concerned
with compilation itself. The first, Pass1, preprocesses the
expression, putting certain things in normal forms, expanding macros
and recording information in tables. The second, Pass2, operating on
the output of Pass1, preforms the final code generation. In
addition, there are verious subsidiary pieces dealing with IO,
debugging and general subroutine support. The three major pieces will
be discussed after a brief digression to consider a basic mechanism
which is used at several places in the compiler.
Fundamental Table Dispatch Mechanism
It is a common process in Lisp programs to process a
S-expression with an atomic car by checking for certain set atoms or
certain classes of atoms and calling routines corresponding to these
various cases. For example, a routine to evaluate arithmetic
expressions might, after first check for an atomic expression, test
to see if car of the expression were one of plus, times etc. and
calling routines to add or multiply as appropriate. In a slightly
more complicated arithmetic interpreter, there might also be certain
classes. Sin, Cosine and Tangent might all lead to the invocation of
the same routine for processing trigonometric functions. The
disadvantages of this technique are twofold. First, it is
inflexible, and second, it is, in certain ways, inefficient. The
latter is a problem of growth. Though the tests are not time
consuming, the total time increases with their number. Much more
important, is the fact that when adding or modifying the behavior of
a process written in this way, the original code must be altered even
for a pure addtion. This may necessitate recompiling or other time
consuming processes. The technique which will now be described
attempts to systematize the process of dispatching on the car of an
expression in such a way as to remove these disadvantages.
Given a non-atomic expression with an atomic car to be
processed, our basic intention is to search the property list of the
car for some property of interest. Rather than having a list of the
properties which are of interest, however, the emphasis on
flexibility can be pushed still furtther by distinguishing
interesting properties by examining their own property lists. Thus
given an atom which is car of an expression, each indicator on its
property list is examined to see if it in turn has a special
property. This last property is not a seat of flexibility but occurs
explicitly in the code. The routine in the compler which
accomplishes this is called GETGET. In the following sections, three
explicit cases of its use will be discussed.
Top Level
When the COMPL command is given, causing the compilation of a
file, the compiler must process each expression in the file
appropriately. The file may contain functions defined by
S-expressions, functions defined by Lap, declarations affecting the
compiler, expressions to be evaluated at load time, comments etc.
Each of these things must be processed differently. The function
definitions must be compiled into Lap, the Lap must be examined to
note the types of the functions it defines, the declarations must be
processed for their intended effects, the runtime S-expressions must
be printed out unchanged, and the comments must be absorbed and
ignored. Further, all of these things must be done in such a way that
changes and additions may be easily made.
After all of the overhead of deciphering the user's command
and preparing for a file processing operation has been put by, the
compiler settles down to considering each expression in the input
file to decide on appropriate action. The routine which does this is
called COMPEFFECT. If the expression is atomic, it is simply printed
and no further action is taken. If it is not atomic, but has an
atomic car, the function GETGET is called with the property
COMPEFFECT in mind. If the car of the expression has a property,
which in turn has the COMPEFFECT property then the entire
S-expression is handed over to the function which is the value of
this COMPEFFECT property. At present, in the case of the top level
there are only two atoms which have the COMPEFFECT property. The
first of these is the atom MACRO. Thus if car of an expression is a
macro, then control will be given to the function ACTONMACRO which
will be found as the CCOMPEFFECT property of the atom MACRO. The
function ACTONMACRO will then make use of the macro definition to
expand the macro and then apply the function ACTONEXPR to the result.
The other distinguished property which will be uncovered by ACTONEXPR
is COMPACTION. This property indicates that the atom in question is
to be dealt with specially. The value of the COMPEFFECT property of
the atom COMPACTION is a simple function which finds the COMPACTION
property of the original atom and hands it control.
Pass One
Pass one is the first part of compiling proper. As noted
above, it will put the expression in normal form by such acts as
expanding macros and recording information for the use of the second
pass. As with the top level, control is passed around among
functions which occur as properties of various atoms and are
discovered by the operation of GETGET. The property given to GETGET
in this case is PASS1. Several atoms have the PASS1 property
including SUBR, FSUBR, MACRO and their relatives. As with the
COMPEFFECTs of the top level, there is a property which indicates
that an expression is to receive special treatment. This property is
P1.
The information kept by pass one is primarily about the
variables occurring in the expression. Variables are divided into
two classes, SPECIAL and LOCAL. Into the former go all variables
whether bound or not which are know to the compiler as special
variables from either declarations or prior free occurrence. The
local variables are kept on a separate list and certain information
is kept about them.
During the process of compilation, it is often necessary to
refer to quantities which are old values of variables. Rather than
assigning to each such quantity a separate name and treating is as a
peculiar sort of variable, the compiler keeps a count during
compilation which is used to record the progress of events. This
count which is set to zero at the beginning of each pass is
incremented each time that any variable is referenced in any way, and
at certain other places. It is desired to guarantee that if a
variable takes on two different values, it may not do so at one value
of the count. The count thus serves the function of a sort of "time"
within the compilation.
The first pass, in addition to maintaining the count, will
record for each local variable the last value of the count at which
it was seen. Thus if the count during the second pass should ever
become larger that that associated with some variable, then the
variable will never be used again. This allows the space it occupied
to be used for something else.
Pass Two
The fundamental processes of the compiler take place in pass
two. Most of the complexity of this section is due to a desire to
exploit the architecture of the PDP-10 by placing function arguments
in the accumulators. In order to avoid the orgy of pushing and
popping the most frequent action in which the compiler must engage is
that of compiling the arguments to a function, then applying the
function to the arguments. If this were done in the straight forward
way in which it would be done if arguments were placed on the stack,
the need to maintain transparancy would result in an orgy of pushing
and popping which would paralyse the computation. Instead the
compiler follows a more sophisticated and vastly more confusing plan.
It compiles in turn each argument without loading it into any
particualar place. This amounts to guaranteeing that the desired
result exists somewhere in the machine and making note that it may be
moved but not damaged. Finally, after all of a functions arguments
have been compiled, they are loaded into the first few accumulators
and the funtion is called.